{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Note\n",
    "\n",
    "View the README.md [here](https://github.com/eclipse/deeplearning4j-examples/blob/master/tutorials/README.md) to learn about installing, setting up dependencies and importing notebooks in Zeppelin"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### DL4J Network Architectures\n",
    "---\n",
    "\n",
    "DL4J provides the following classes to configure networks:\n",
    "1. 'MultiLayerNetwork'\n",
    "2. 'ComputationGraph'\n",
    "\n",
    "\n",
    "### 1. MultiLayerNetwork\n",
    "'MultiLayerNetwork' consists of a single input layer and a single output layer with a stack of layers in between them.\n",
    "\n",
    "### 2. ComputationGraph\n",
    "'ComputationGraph' is used for constructing networks with a more complex architecture than 'MultiLayerNetwork'. \n",
    "It can have multiple input layers, multiple output layers and the layers in between can be connected through a direct acyclic graph.\n",
    "\n",
    "---\n",
    "\n",
    "### Network Configurations\n",
    "Whether you create 'MultiLayerNetwork' or 'ComputationGraph', you have to provide a network configuration to it through 'NeuralNetConfiguration.Builder'.\n",
    "'NeuralNetConfiguration.Builder', as the name implies, provides a Builder pattern to configure a network.\n",
    "To create a 'MultiLayerNetwork', we build a 'MultiLayerConfiguraion' and for 'ComputationGraph', it's 'ComputationGraphConfiguration'.\n",
    "\n",
    "The pattern goes like this: [High Level Configuration] -> [Configure Layers] -> [Pretraining and Backprop Configuration] -> [Build Configuration]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Required imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "import org.deeplearning4j.nn.api.OptimizationAlgorithm\r\n",
    "import org.deeplearning4j.nn.conf.graph.MergeVertex\r\n",
    "import org.deeplearning4j.nn.conf.layers.{DenseLayer, GravesLSTM, OutputLayer, RnnOutputLayer}\r\n",
    "import org.deeplearning4j.nn.conf.{ComputationGraphConfiguration, MultiLayerConfiguration, NeuralNetConfiguration}\r\n",
    "import org.deeplearning4j.nn.graph.ComputationGraph\r\n",
    "import org.deeplearning4j.nn.multilayer.MultiLayerNetwork\r\n",
    "import org.deeplearning4j.nn.weights.WeightInit\r\n",
    "import org.nd4j.linalg.activations.Activation\r\n",
    "import org.nd4j.linalg.learning.config.Nesterovs\r\n",
    "import org.nd4j.linalg.lossfunctions.LossFunctions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Building a MultiLayerConfiguration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "val multiLayerConf: MultiLayerConfiguration = new NeuralNetConfiguration.Builder()\r\n",
    "  .seed(123).learningRate(0.1).iterations(1).optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT).updater(new Nesterovs(0.9)) //High Level Configuration\r\n",
    "  .list() //For configuring MultiLayerNetwork we call the list method\r\n",
    "  .layer(0, new DenseLayer.Builder().nIn(784).nOut(100).weightInit(WeightInit.XAVIER).activation(Activation.RELU).build()) //Configuring Layers\r\n",
    "  .layer(1, new OutputLayer.Builder().nIn(100).nOut(10).weightInit(WeightInit.XAVIER).activation(Activation.SIGMOID).build())\r\n",
    "  .pretrain(false).backprop(true) //Pretraining and Backprop Configuration\r\n",
    "  .build() //Building Configuration"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " \r\n",
    "\r\n",
    "### What we did here?\r\n",
    "---\r\n",
    "\r\n",
    "**- High Level Configuration**\r\n",
    "\r\n",
    "Function         | Details\r\n",
    "---------------- | -------------\r\n",
    "seed             | For keeping the network outputs reproducable during runs by initializing weights and other network randomizations through a seed\r\n",
    "learningRate     | For identifying the network learning rate\r\n",
    "iterations       | For identifying the number of optimization iterations\r\n",
    "optimizationAlgo | Optimization Algorithm to use for training. Run 'OptimizationAlgorithm.values().foreach { println }' to see different optimization algorithms that you can use.\r\n",
    "updater          | Algorithm to be used for updating the parameters\r\n",
    "\r\n",
    "---\r\n",
    "**- Configuration of Layers**\r\n",
    "\r\n",
    "Here we are calling list() to get the 'ListBuilder'. It provides us the necessary api to add layers to the network through the 'layer(arg1, arg2)' function.\r\n",
    "- The first parameter is the index of the position where the layer needs to be added.\r\n",
    "- The second parameter is the type of layer we need to add to the network.\r\n",
    "\r\n",
    "To build and add a layer we use a similar builder pattern as:\r\n",
    "\r\n",
    "Function         | Details\r\n",
    "---------------- | -------------\r\n",
    "nIn              | The number of inputs coming from the previous layer. (In the first layer, it represents the input it is going to take from the input layer)\r\n",
    "nOut             | The number of outputs it's going to send to the next layer. (For output layer it represents the labels here)\r\n",
    "weightInit       | The type of weights initialization to use for the layer parameters. Run 'WeightInit.values().foreach { println }' to see different weight initializations that you can use.\r\n",
    "activation       | The activation function between layers. Run 'Activation.values().foreach { println }' to see different activations that you can use.\r\n",
    "\r\n",
    "---\r\n",
    "**- Pretraining and Backprop Configuration**\r\n",
    "\r\n",
    "Function         | Details\r\n",
    "---------------- | -------------\r\n",
    "pretrain         | False if training from scratch\r\n",
    "backprop         | Whether to backprop or not\r\n",
    "\r\n",
    "---\r\n",
    "**- Building a Graph**\r\n",
    "\r\n",
    "Finally, the last build() call builds the configuration for us\r\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Reality checking for our MultiLayerConfiguration\n",
    "You can get your network configuration as String, JSON or YAML for reality checking.\n",
    "For JSON we can use the 'toJson()' function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "println(multiLayerConf.toJson)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating a MultiLayerNetwork\n",
    "\n",
    "Finally, to create a 'MultiLayerNetwork', we pass the configuration to it as shown below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "val multiLayerNetwork : MultiLayerNetwork = new MultiLayerNetwork(multiLayerConf)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Building a ComputationGraphConfiguration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "val computationGraphConf : ComputationGraphConfiguration = new NeuralNetConfiguration.Builder()\r\n",
    "      .seed(123).learningRate(0.1).iterations(1).optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT).updater(new Nesterovs(0.9)) //High Level Configuration\r\n",
    "      .graphBuilder()  //For configuring ComputationGraph we call the graphBuilder method\r\n",
    "      .addInputs(\"input\") //Configuring Layers\r\n",
    "      .addLayer(\"L1\", new DenseLayer.Builder().nIn(3).nOut(4).build(), \"input\")\r\n",
    "      .addLayer(\"out1\", new OutputLayer.Builder().lossFunction(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD).nIn(4).nOut(3).build(), \"L1\")\r\n",
    "      .addLayer(\"out2\", new OutputLayer.Builder().lossFunction(LossFunctions.LossFunction.MSE).nIn(4).nOut(2).build(), \"L1\")\r\n",
    "      .setOutputs(\"out1\",\"out2\")\r\n",
    "      .pretrain(false).backprop(true) //Pretraining and Backprop Configuration\r\n",
    "      .build() //Building configuration"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### What we did here?\n",
    "---\n",
    "\n",
    "The only difference here is the way we are building layers.\n",
    "Instead of calling the 'list()' function, we call the 'graphBuilder()' to get a 'GraphBuilder' for building our 'ComputationGraphConfiguration'\n",
    "Following table explains what each function of a 'GraphBuilder' does\n",
    "\n",
    "---\n",
    "\n",
    "Function         | Details\n",
    "---------------- | -------------\n",
    "addInputs        | A list of strings telling the network what layers to use as input layers\n",
    "addLayer         | First parameter is the layer name, then the layer object and finally a list of strings defined previously to feed this layer as inputs\n",
    "setOutputs       | A list of strings telling the network what layers to use as output layers\n",
    "\n",
    "---\n",
    "\n",
    "The output layers defined here use another function 'lossFunction' to define what loss function to use. \n",
    "Use LossFunctions.LossFunction.values().foreach { println } to see what loss functions are available."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Reality checking for our ComputationGraphConfiguration\n",
    "You can get your network configuration as String, JSON or YAML for reality checking.\n",
    "For JSON we can use the 'toJson()' function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "println(computationGraphConf.toJson)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating a ComputationGraph\n",
    "Finally, to create a 'ComputationGraph', we pass the configuration to it as shown below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "val computationGraph : ComputationGraph = new ComputationGraph(computationGraphConf)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### More MultiLayerConfiguration Examples\n",
    "---\n",
    "\n",
    "### 1. Regularization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "//You can add regularization in the higher level configuration in the network through first allowing regularization through 'regularization(true)' and then chaining it to a regularization algorithm -> 'l1()', l2()' etc as shown below:\r\n",
    "new NeuralNetConfiguration.Builder().regularization(true).l2(1e-4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Dropout connects"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "//When creating layers, you can add a dropout connection by using 'dropout(<dropOut_factor>)'\r\n",
    "new NeuralNetConfiguration.Builder()\r\n",
    "    .list() \r\n",
    "    .layer(0, new DenseLayer.Builder().dropOut(0.8).build())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " \n",
    "\n",
    "### 3. Bias initialization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "//You can initialize the bias of a particular layer by using 'biasInit(<init_value>)'\n",
    "new NeuralNetConfiguration.Builder()\n",
    "    .list() \n",
    "    .layer(0, new DenseLayer.Builder().biasInit(0).build())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### More ComputationGraphConfiguration Examples\n",
    "---\n",
    "\n",
    "### 1. Recurrent Network with Skip Connections"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "val cgConf1 : ComputationGraphConfiguration = new NeuralNetConfiguration.Builder()\r\n",
    "        .learningRate(0.01)\r\n",
    "        .graphBuilder()\r\n",
    "        .addInputs(\"input\") //can use any label for this\r\n",
    "        .addLayer(\"L1\", new GravesLSTM.Builder().nIn(5).nOut(5).build(), \"input\")\r\n",
    "        .addLayer(\"L2\",new RnnOutputLayer.Builder().nIn(5+5).nOut(5).build(), \"input\", \"L1\")\r\n",
    "        .setOutputs(\"L2\")\r\n",
    "        .build();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Multiple Inputs and Merge Vertex"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "//Here MergeVertex concatenates the layer outputs\r\n",
    "val cgConf2 : ComputationGraphConfiguration = new NeuralNetConfiguration.Builder()\r\n",
    "        .learningRate(0.01)\r\n",
    "        .graphBuilder()\r\n",
    "        .addInputs(\"input1\", \"input2\")\r\n",
    "        .addLayer(\"L1\", new DenseLayer.Builder().nIn(3).nOut(4).build(), \"input1\")\r\n",
    "        .addLayer(\"L2\", new DenseLayer.Builder().nIn(3).nOut(4).build(), \"input2\")\r\n",
    "        .addVertex(\"merge\", new MergeVertex(), \"L1\", \"L2\")\r\n",
    "        .addLayer(\"out\", new OutputLayer.Builder().nIn(4+4).nOut(3).build(), \"merge\")\r\n",
    "        .setOutputs(\"out\")\r\n",
    "        .build();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. Multi-Task Learning"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "autoscroll": "auto"
   },
   "outputs": [],
   "source": [
    "val cgConf3 : ComputationGraphConfiguration = new NeuralNetConfiguration.Builder()\r\n",
    "        .learningRate(0.01)\r\n",
    "        .graphBuilder()\r\n",
    "        .addInputs(\"input\")\r\n",
    "        .addLayer(\"L1\", new DenseLayer.Builder().nIn(3).nOut(4).build(), \"input\")\r\n",
    "        .addLayer(\"out1\", new OutputLayer.Builder()\r\n",
    "                .lossFunction(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)\r\n",
    "                .nIn(4).nOut(3).build(), \"L1\")\r\n",
    "        .addLayer(\"out2\", new OutputLayer.Builder()\r\n",
    "                .lossFunction(LossFunctions.LossFunction.MSE)\r\n",
    "                .nIn(4).nOut(2).build(), \"L1\")\r\n",
    "        .setOutputs(\"out1\",\"out2\")\r\n",
    "        .build();"
   ]
  }
 ],
 "metadata": {
  "zeppelin": {
   "name":"MultiLayerNetwork and ComputationGraph",
   "id":"2CVAHGVDC"
  },
  "kernelspec": {
   "display_name": "Spark 2.1.0 - Scala 2.11",
   "language": "scala",
   "name": "spark2-scala"
  },
  "language_info": {
   "codemirror_mode": "text/x-scala",
   "file_extension": ".scala",
   "mimetype": "text/x-scala",
   "name": "scala",
   "pygments_lexer": "scala",
   "version": "2.11.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
